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Abstract 

In this work we propose a hierarchy of Monte Carlo methods for sampling 
equilibrium properties of stochastic lattice systems with competing short and 
long range interactions. Each Monte Carlo step is composed by two or more 
sub - steps efficiently coupling coarse and microscopic state spaces. The 
method can be designed to sample the exact or controlled-error approxima- 
tions of the target distribution, providing information on levels of different 
resolutions, as well as at the microscopic level. In both strategies the method 
achieves significant reduction of the computational cost compared to conven- 
tional Markov Chain Monte Carlo methods. Applications in phase transition 
and pattern formation problems confirm the efficiency of the proposed meth- 
ods. 
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1. Introduction 



Our primary goal in this work is to develop a systematic mathematical 
and computational strategy for accelerating microscopic simulation methods 
with competing short and long range interactions, arising in numerous phys- 
ical systems for instance in micromagnetics, models of epitaxial growth, etc. 
We propose the Multilevel Coarse Graining Monte Carlo (Multilevel CGMC) 
method, based on a hybrid statistical mechanics and statistics approach. The 
method introduces a hierarchy of Markov Chain Monte Carlo methods cou- 
pling scales and types of interactions that can sample the exact or controlled 
error approximations of Gibbs measures fiN^(da) = Z~^ L e~^ HN<y(J ' ) P^ido) de- 
fined on a high dimensional space, N » 1, that can be easily generalized to 
any probability measure with similar properties. It is a method of construct- 
ing efficient proposal measures in Metropolis sampling using coarse-graining 
techniques, see also ([7]) below, aiming at reducing the rejection rate and the 
computational complexity. The key idea is a decomposition of the sampling 
distribution to a product measure 

^NA da ) = ftM,p( d v)M d(7 \v) , (i) 

with 7] := Tcr a variable with less degrees of freedom compared to cr, de- 
fined by a projection map T : Tj N — > S M , M < N. p{drj) is a measure 
with a simple explicit representation approximating the marginal jJLM,p(drj) = 
Hn,p ° T~ l (dri) and v r (da\rj) is a uniquely defined (prior) measure, respon- 



sible for reconstructing variables a given rj, [19(. Such a two-level measure 
decomposition can be trivially extended to a multi-level setting where 01) 
can include different resolution levels interpolating between a coarser level 
and the microscopic one a. 

In the following we describe a Monte Carlo step of a two-level CGMC 
method: 

1. Sample 77 from fiMs(dr}) : using Coarse Grained (CGMC) samplers 



— I r— 1 ■ ■ >»,P\ 

181 . 1171 ] . Appropriate coarse grained measures have been evaluated in 



earlier work via cluster expansions that can be easily constructed with 
available analytical error estimates ensuring that such approximations 



are controllable 20 



Conditioned on such 7], obtained in Step 1, we sample v r {da\rj) via an 
Accept/Reject method. 
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A schematic description of this procedure can be seen in Figure [TJ Better 
proposals constructed in Step 1 will lead to fewer rejections in Step 2, fur- 
thermore, there is no need to consider all possible microscopic proposals since 
at the coarse step we do a first screening. Instead of further approximating 
the sampling measure, as is done with cluster expanding (jSj) keeping the typ- 
ically computationally expensive multi-body higher order terms [23] , we use 
the hybrid statistical and statistics approach that construct u r (da\i]). Even 
when CGMC provide less accurate approximations, the multilevel CGMC 
approach can refine the results by the Accept/Reject step in the finer space. 
A key issue is how to sample conditionally from v r in an efficient manner in 
Step 2, picking an appropriate coarse proposal in Step 1. 

The necessary ingredient for the applicability of the method is a decompo- 
sition of the form ((TJ), which includes a possibly less accurate coarse-grained 
measure and the correcting accept/reject Step 2 above. This formulation can 
make the proposed method extentable to off-lattice systems where various 
coarse-graining schemes are already available 26|, |l2|, |2j, although without 



controlled-error approximations. In such off-lattice systems we typically 
have two main features: a presence of short and longer interactions, as well 
as comparable energy and entropy, hence fluctuations are expected to be 
important in the modelling and simulation. 

Systems with smooth long or intermediate range interactions are well 



approximated by coarse graining techniques [171 . Il9l . |24I |. and CGMC are 



reliable simulation methods with controlled-error approximations, both for 



observables and loss of information [21|, |20jj. Furthermore, models where 
only short-range interactions appear are inexpensive to simulate with con- 
ventional methods. However, when both short and long range interactions 
are present, the conventional methods become prohibitively expensive, and 
CG error estimates are not applicable. The proposed method can handle 
such systems efficiently by either compressing only the long range interac- 
tions, and sample with CGMC with low computational cost, at Step 1 and 
include the short range part at the accept/reject Step 2, or compress all types 
of interactions for Step 1, and correct appropriately in Step 2. 

A wide literature exists on sophisticated Markov Chain Monte Carlo 
(MCMC) methods designed to accelerate simulations for large systems, ap- 
plying for example parallelising techniques and/or constructing good first 
approximations (proposals) in Metropolis sampling (§], 31]. In 10 1 Efendiev 
et.al., propose the Preconditioning MCMC, a two stage Metropolis method, 
applied to inverse problems of subsurface characterization. Our algorithm 
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shares the same idea of constructing a proposal density based on meso 
/macro-scopic properties of the model studied and taking advantage of the 
first stage rejections. Several methods where the trial density is built up se- 



quentially with stage- wise rejection decision appear, [3[, 28]. There are also 



some similarities with simulated sintering, and transdimensional MCMC, see 



29l . |28| and references therein. However, the novelty of our method lies on 
the construction of the variable dimensionality (and level of coarse-graining) 
state spaces and the corresponding Gibbs measures relies on statistical me- 
chanics tools that allow a systematic control of the error from one level of 
coarse-graining to the next. The interplay of different levels of compressed 
spaces appears also in spatial multi-grid methods coupled with CGMC sam- 
pling, studied in 0] for accelerating lattice kinetic Monte Carlo simula- 
tions where however the proposed methods are not exact and do not neces- 
sarily provide controlled-error approximations. Various attempts appear on 
parallelising Monte Carlo simulations, based on a parallel resolution, such as 
parallel kinetic CGMC [l[ and a combination of CGMC and parallel tem- 



pering [35|. In a follow up work we extend our framework for multilevel 



CGMC to accelerate sampling of evolution processes on large lattice systems 



14j . There we develop multilevel Kinetic Monte Carlo algorithms, based 
on the known from coarse graining techniques different level approximating 
processes. 

In Section [2] we introduce microscopic lattice systems and provide a brief 
review of coarse graining methods. We also provide the Metropolis-Hastings 
MCMC method for numerical simulations and describe microscopic pro- 
cesses. We introduce the multilevel CGMC (ML CGMC) method in Section [3] 
and provide the mathematical analysis that ensures the theoretical validity of 
the method. Section [^provides a full comparison of the computational com- 
plexity between the classical and the multilevel Metropolis introduced here. 
More specifically, in Theorem [2] we prove comparative estimates on their 
mixing times via spectral gap estimates. In turn such spectral estimates are 
obtained through suitable relative bounds on their respective Dirichlet forms. 
Concluding, our analysis shows that ML CGMC has substantial savings over 
the classical MH algorithm, generating cheaply proposals with small or con- 
trollable rejection rates. Sections |5] and |6] give example applications of the 
multilevel method in canonical and microcanonical sampling. In Section [5TT1 
a benchmark example is employed in order to demonstrate an explicit ap- 
plication of Theorem [2j In subsection 15.21 an order one improvement of the 
coarse graining error in a phase transition regime is achieved when applying 
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the multilevel CGMC method with a potential splitting approach, for a Kac 
type potential with algebraic decay. Finally in Section we study nanopat- 
tern formation in surface diffusion induced by a Morse type potential, and 
verify that the proposed method can provide correctly microscopic details. 



2. Stochastic lattice systems at equilibrium 

We consider an Ising-type system on a periodic d- dimensional lattice 
with N = n d lattice sites. At each site x G Ajv we define a spin variable o~(x) 
taking values in a finite set. For instance, in a lattice gas model a(x) G {0, 1} 
describes that the site x is vacant or occupied by an atom. The state of the 
system is described by a configuration a G T, N = {0, 1} A]V . The interaction 
energy of the system, e.g., interacting particles in the lattice gas model, is 
defined by the Hamiltonian H^. We assume systems where the particles 
interact only through a pair-wise potential and thus the Hamiltonian takes 
the form a = {cr(x) : x G A^} 

h n {°) = -\zZzZ j ^~ vM x Hv) + E Kx)°(?) , ( 2 ) 

where h is an external field. Equilibrium states at the inverse temperature /3 
are described by the (canonical) Gibbs probability measure on the space £ n 

fi Nt p(da) = Z N l e-? H ^ P N (da) , (3) 

where Zn is the normalizing constant (partition function). Furthermore, 
the product Bernoulli distribution P^(da) is the prior distribution on A^ 
representing distribution of states in a non-interacting system, or equivalently 
at (5 = 0, when thermal fluctuations-disorder-associated with the product 
structure of PN^da) dominates. By contrast at zero temperature, (3 = oo, 
interactions and hence order, prevail. Finite temperatures, < /3 < oo, 
describe intermediate states, including possible phase transitions between 
ordered and disordered states. 

The coarse-graining techniques have been developed in order to study the 
behaviour in the regimes when the size of the system N — > oo. In the series of 



papers [171, |18|, |25| the authors initiated the development of coarse- graining 
(CG) as a computational tool for accelerating Monte Carlo simulations of 
stochastic lattice dynamics. The coarse-grained model is constructed on a 
coarse grid by dividing into M coarse cells, each of which contains 
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Q (micro-)cells, typically Q = q d with the coarse-graining scale q in each 
dimension. Each coarse cell is denoted by C^, 6 A^. A typical choice for 
the coarse variable in the context of Ising-type models is the block-spin over 
each coarse cell Ck, 



V 



i r}(k) = a ^ x ) : k e A m \ 



defining the coarse graining map T : XV — > £jw> Tct = rj. The exact 
coarse-grained Gibbs measure is given (with a slight abuse of notation) by 
P>M,p — Hn,p °T _1 , written in a more convenient form 

fiuAdv) = -^e-P R ^)p M ( dr]) ; (4) 

where pM{dr]) = P^ o T _1 defines the coarse-grained prior measure. The 
exact coarse-grained Hamiltonian is defined by the renormalization group 
map, see, e.g., 

e -pS M (rf) = E [e-^^] = J e-P HN ^P N {da\ri) . (5) 

The conditional prior P^(da\rj) is the probability of having a microscopic 
configuration a, given a coarse configuration r\. 

Although typically Puidi]) is easy to calculate the exact computation of 
the coarse-grained Hamiltonian Hm{v) given by ([5]) is, in general, impossible 
even for moderately small values of N. Therefore suitable approximations 
have to be constructed. An initial a ppr oximation can be then proposed by 
the technique of cluster expansions, |34j providing improved approximation 
at suitable phase regimes. The corresponding first order CG Hamiltonian is 
explicitly given, [20], 



m°\ v )= - ~ J2 £ J (MM - ^(0, 0) £ V(k)(v(k)-1) 

keA M l + k keA M 

+ E HQ , (6) 

feeA M 

where the coarse-grained interactions are evaluated explicitly by averaging 
over the cells k,l G Am 

q xeC k yeCi VW > x&C k y€C k ,y^x 
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defining the coarse grained Gibbs measure 



AW = P^ ( ° )( ^mN. (7) 

The coarse-graining of systems with purely long- or intermediate-range in- 
teractions was studied using cluster expansions in 0, 0, [l9[ 22]. In many 
applications the long-range potentials exhibit scaling property 

J(x-y)=L- d V (j\x-v\) , x,yeA N , (8) 

where V G C 1 ([0,cxd)) and it is normalized to ensure that the strength of 
the potential J is essentially independent of L, i.e., J2x^o J( x ) ~ Jo°° V( r )dr. 
The constant L can be interpreted as a (characteristic) interaction range of 
the potential. For example, if we have V with properties V(r) = V(—r), 
V(r) = 0, |r| > 1, then a spin at the site x interacts with its neigh- 
bours which are at most L lattice points away from x. One of the results 
therein is on deriving error estimates in terms of the specific relative entropy 
lZ{ji\u) := N^ 1 J2 a l°g {/ i ( cr )/ z/ ( (T )}A i ( (T ) between the corresponding equi- 
librium Gibbs measures. Note that the scaling factor iV -1 is related to the 
extensivity of the system, hence the proper error quantity that needs to be 
tracked is the loss of information per particle. 

^(4?>^ o T- 1 ) = G(e 2 ) , e = fllWUi (|) , (9) 

Systems with short and long range interactions. One of our goals in this 
work is to study systems where in addition to the long-range potential we 
have a short range 

K(x -y) = S- d V s (||a; - y|) , x, y e A N , (10) 

with V s having similar properties as V in (jHJ) and S L distinguishing 
the short and long range nature of interactions. A typical case of a short- 
range potential is encountered in the nearest-neighbour Ising model where 
Kix — y) = K = constant, for \x — y\ = S = 1 and zero otherwise. The new 
Hamiltonian including the contributing energy from the short range potential 
H s (a) and the long range part Hi (a) is 

H N (a) = H s (o-)+H l (a). (11) 
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Study of equilibrium properties of stochastic lattice systems mainly accounts 
for the evaluation of averages over the coarse-grained JjLM,p{drj) or the micro- 
scopic ^N,ii{da) Gibbs measures of observables : — > R, i.e., 



J Sjv 

Numerical methods evaluating equilibrium averages are the Markov Chain 
Monte Carlo (MCMC) methods, am ong which the most widely used is the 



generates proposals a', for the evolution from the configuration a to a', that 
are defined by the proposal probability transition kernel p(a',a). The pro- 
posal a' is accepted with probability a(a, a') or rejected with the probability 
1 — a(a,a'). Let X = o~q be an arbitrary initial configuration, the n-th 
iteration of the algorithm consists of the following steps 

Algorithm 1 (Metropolis-Hastings algorithm). Given X n = a 

Step 1 Generate X' n = a' ~ p(o~',o~). 

Step 2 Accept-Reject 



with the acceptance probability depending on the energy difference be- 
tween configurations a and a', AH N (a,a') = H N (a') — H N (a), 



The algorithm generates an ergodic Markov chain {X n } in the state space Sjy, 
with the stationary measure p^^^da). Ergodicity ensures the convergence 
of empirical averages - X)ILi 4>{Xi) to the desired mean E [</>], for any G 
L 1 (pn,p)- It is easy to deduce the probability transition kernel associated to 
MH Algorithm Q] 





Metropolis-Hastings algorithm 




X' n = a' with probability a(a, a'), 
X n = a with probability 1 — a(a, a') 




/C c (cr, a') = a(a, a')p(a, a') + 1 



L 



a(a,a')p(a,a')da' 5 (a' — a) . (12) 



where 5 denotes the Dirac measure. 
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Depending on whether one considers microcanonical or canonical ensem- 
ble the ergodic Markov chain {X n } can be defined by spin-exchange dynam 



ics that preserve the order parameter or spin-flip dynamics respectively [27 
0. In the spin-exchange the proposed new configuration a' = o~( x ' y ^ is ob- 
tained from a by interchanging the spins at x and y, for nearest-neighbour 
sites x and y. 

{a(y) , when z = x, 
<j(x) , when z = y, 
a{z) , otherwise. 

Analogously for the spin-flip a' = is obtained from a by flipping the spin 
value at site x 

1 — o~(x) , when z — x, 
a(z) , otherwise 



a ix \z) 



3. The multilevel CGMC Metropolis-Hastings method 

We present in detail and gen eralize the Coupled Metropolis-Hastings 
method originally proposed in [15j. We introduce a multi-level Metropolis- 
Hastings method, where each level corresponds to a configuration space res- 
olution level, and provide the associated mathematical analysis that ensures 
the theoretical validity of the method. 

Let {SMj}j=o denote a hierarchy of coarse spaces derived from the mi- 
croscopic space Stv =: Em by a family of mappings Tj : — > 
Tjf]j = rjj + i, for iV = M > Mi > • • • > Mj. The variables rjj denote con- 
figurations on spaces Em-> with t] =: a referring to the microscopic variable 
on T, N . The method is composed by a sequence of I + 1 Metropolis-Hastings 
steps each one designed to generate samples from E^f . given a coarser sample 
from Em j+1 - Properly constructed measures on Em^ , j > form the basis for 
constructing efficient proposal kernels for Metropolis-Hastings algorithm al- 
lowing sampling of large systems. The interplay between different resolution 
spaces is controlled by the hierarchy of projection mappings Tj and corre- 
sponding inverse mapping procedures (reconstruction). Appropriate recon- 
struction measures u rt j(dr]j\r]j + i) are constructed in view of decompositions 
similar to (TT3"]) . for the each pair of Gibbs measures fij(dr]j) and fij + i(dr]j +1 ), 
see Figure [TJ For the sake of exposition we describe in detail the two-level 
CGMC method, in which two different scale state spaces Ej^. = Ejy and 
E Mi = E M , M < N are involved. 
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Figure 1: Schematic of a two- level ML CGMC. Information exchange between coarser 
and finer resolutions. Step 1: Sample Hj+i(dr)j-\-i), Step 2: Reconstruct with v r (dr)j\rij+i) 
such that ^,j{drjj) = ^,j + i(drjj + i)v r {drjj\rij + i). 

The ML CGMC method is a hybrid statistical mechanics and statistics 
approach based on a product measure decomposition of the target distribu- 
tion 

/ijv,/3(dcr) = j$j{drj)v r {d(j\rj) . (13) 

The principal idea is that computationally inexpensive CG simulations will 
reproduce the large scale structure and subsequently microscopic information 
will be added through the microscopic reconstruction. The two levels of the 
method are described by: 

1. Marginal approximation and CGMC sampling. fr^Jdrf), given in (j7|), 
an approximation of marginal p,M,p — A i A r ,/3°T~ 1 , is described explicitly 
with a controllable error [ip| . 

2. Microscopic reconstruction. Conditioned on rj, obtained by sampling 
from p,y p(drj), we sample v r {da\rj) via a simple inverse-mapping dis- 
tribution iA r (da\rj) and an Accept/Reject method. Measure v r (da\rj) is 
responsible for the reconstruction procedure of the fine configuration a 



10 



constrained on the coarse configuration r\. 

More specifically, reconstruction is the reverse procedure to coarse-graining, 
i.e., reproducing microscopic properties directly from CG simulations. A 



detailed discussion on reconstruction can be found in |19l . |15| . Note that 
h> r {da\rj) is a finite measure, uniquely defined by f fT3|) given iiN,p{.da) and 

In view of Decomposition (|T3|) we propose the two-level CGMC algo- 
rithm composed of two Metropolis-Hastings steps. The first step samples 
the measure Ji^j ' pidrj) ([7]), on the coarse space Em, using an arbitrary pro- 
posal transition kernel p(r], 7/) to produce coarse trial samples 77'. The second 
step, performed only if the coarse trial sample is accepted, consists of the sim- 
ple reconstruction from the coarse state 7/ with fi r (da\rj) and an accept/reject 
method. If a trial coarse sample is rejected the second step is not performed 
and no computational time is wasted on checking fine trial samples that are 
most likely to be rejected. The two-level CGMC algorithm is defined as 
follows: Let Y = o"o be an arbitrary initial configuration, for n = 0, 1, 2, . . . , 

Algorithm 2 (Two-level CGMC MH Algorithm). Given Y n = a 
Step 1 Compute the coarse configuration n = Ta. 

Step 2 Generate a coarse sample rf ~ p(j],r}'). 

Step 3 Coarse Level Accept- Reject. 
Accept rj' with probability: 

accM) = min {l, e -^ (0) ^)gj^ } . (14) 

if rj is accepted then proceed to Step 4, 
else generate a new coarse sample Step 2. 

Step 4 Reconstruct a' given the coarse trial rj', 

Step 5 Fine Level Accept- Reject. 
Accept a' with probability 

a f (a,a') = min (l, e -^ N (^')-^ M) ] Jt^LX . (15) 
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where AH N {a, a') = H N (a') - H N (a), and Aif(°% 7/) = H^{r]') - H^{rj). 

In terms of Metropolis-Hastings Algorithm [TJ the method is generat- 
ing trial samples a' from proposal kernel p(o~,o~'), with stationary measure 
Pm B{^v)l J 'r(da\r]), that depends on the statistical mechanics properties of 
the system. This fact leads to an increase of the acceptance rate of the MH 
method, see for example Figure HH Indeed, consider the canonical ensemble 
for x G A N such that x G Ck, k G Am with a potential J(x — y) where 



the following estimate holds [18 



AH N {a,a^)-AH^{ V , V ^) = 0(Q/{2L + l) d ) , 

with This estimate shows that the second level acceptance 

probability a/(cr, a') is controlled by the coarsening parameter, i.e. it is close 
to 1 for Q/(2L + l) d << 1 and most of the coarse samples entering the second 
level will be accepted. Thus, for fixed L, varying the coarsening parameter we 
can control the effectiveness of the method as a balance between acceptance 
rate and computational cost. That is between small values of Q, leading 
to high acceptance rate, and larger Q, leading to lower computational cost 
at sampling at the coarse space. We elaborate with this issue in detail in 
Section HI 

3.1. Examples of multi-level decompositions 

In this section we elaborate on how to select the prior measure v T (do~\rj), 
that furthermore determines how to perform a multi-level decomposition of 
the Gibbs measure. The selection of v r (da\rj) depends on two features (a) the 
choice of the coarse grained measure fi^ p(dr}), that is whether we compress 
the entire interaction potential or split to a short and long range part and 
(b) on sampling the exact or an approximation of the microscopic measure 
yLN,p{da). 

Corrections. With this strategy all types of interactions are compressed and 
incorporated into the first level of sampling with H^°'(r]), as is depicted in 
the algorithmic description of the method Algorithm |5J The reconstructive 
and corrective characteristics of v r {do~\r]) appear explicitly in the following 
formulation, that is reconstruct with the uniform conditional distribution 
li r [da\rj) = PN(da\r)) and correct with H^{a) — H^ \rj), 

v r (da\ V ) = Z { M ] Z N l exp{-/3[H N (a) - # (0) (r/)]} P N {da\ V ) . 
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Potential splitting. An alternative approach is considered based on splitting 
the inter particle interactions into short- and long-range terms. A decom- 
position of the coarse-graining of the interaction potential can be justified 
and optimized by known error estimates, see [2J. These estimates suggest a 
natural way to split the potential into a short-range piece K(x — y) with pos- 
sible singularities and a locally integrable (or smooth) long-range decaying 
component J(x — y). 

Here we suggest to sample on the coarse step according to the effective 
Hamiltonian Hj (77), corresponding only to the long-range depended energy 
Hi(a), as in (TTT1) . that suggests a rearrangement of decomposition ({TBI) where 

u r (da\ V ) = Z^Z- 1 exp{-/3[H s (cx) + H t (a) - ( V )]}P N (da\ V ) . 

In the two-level CGMC sampling the costly long-range part is involved only 
in the coarse updating where the number of operations to calculate energy 
differences is reduced and coarsening of the short-range potential is avoided 

3- 

Approximate CG. In many applications where meso/macroscopic infor- 
mation is sufficient CGMC sampling is reliable for long-range potentials and 
the error when neglecting terms AHi(a,a') — AH^^,^') is small. In this 
strategy we suggest to neglect these terms, despite the encountering of the 
approximating error, benefiting from a further reduction of the computa- 
tional complexity, see Section HI As a result Algorithm [2] is sampling from a 
probability measure approximating fi Nt!3 (da), 

I^nA^) k exp{-PH 8 (c) - PH[ 0) (r t )}P N {da\r t )jif] l3 (dri) . 

3.2. Reversibility of the Multilevel CGMC 

Mathematical analysis for the two-level CGMC method is studied in this 
section for a broad class of probability measures. The method can be gen- 
eralized to the sampling of any probability measure fi(da) on a countable 
configuration space E, by coupling properly configurations between a hierar- 
chy of coarser and finer spaces. Since almost any probability density /i(cr) can 
be written in the form e~ H ^ the method can applied to any model for which 
one can properly define a function H(a) and the hierarchy of coarse spaces 
and densities. The following arguments are straightforward but necessary 
to prove that the algorithm samples processes with the desired stationary 
measure. Algorithm [2] is defined by the acceptance probabilities 
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and 

after, cr = mm < 1, , N ; — - — ; — ; — - > , 17 

where with a slight abuse of notation we denote the probability density of 
measures with the same letter. Note that in the sequel rj = TV, rf = Tcr', if 
not otherwise stated. A trial state a' is generated by the first level and the 
simple reconstruction with the transition probability 

Q(a,a') = a C G(v,v')p(v,v')^r(^'\v'), for ° ^ °' ■ 

It is easy to check that Q(a,a') satisfies the detailed balance condition, 
see Definitional with po(da) = p^ (drj) p r (da\rj) . i.e. Q(a, a')po(a) = 
Q((j', a)fi (a'). This property lead to the simplified formulation of the second 
level acceptance probability (TTTT) from the one suggested by the Metropolis 

method otf(a,cr') = min jl, ^J^^^ j. The probability of moving from 

a state a to a next cr' is af(a,a')acG(ViV')t J >r( crf \v')p( r )i' r }')- Therefore the 
method generates a Markov chain {Y n }, starting from an arbitrary initial 
state o"o, with transition kernel 

Kcg{v,v') = Uf{a,a')acG{v,i)v<-r{v'\i)p{v,'n') for cr 7^ cr' , (18) 
fccG(°,v) = 1 - J^af(^^')acG(v,V')Pr(cr'\ri')p(r],r]')da' . 

We denote E = {a G £ : //(cr) > 0} and E = {a G £ : // (cr) > 0} the 
support of the microscopic and the proposal distributions respectively. 



With the following Theorem, the proof of which is given in Appendix A 



we prove that transition kernel JCcg(&, °~') satisfies the detailed balance con- 
dition, that ensures the method generates samples from the target mea- 
sure. Furthermore, irreducibility and aperiodicity properties are satisfied 
that guarantee ergodicity of {Y n }, i.e. - Y^=i /Oj') * s a convergent approx- 
imation of the averages J f(a)p(da) for any / G L 1 (p). 

Theorem 1. For every conditional distribution p(t),t)') on S ; and p r (-\rj) on 
{ff G S: Tcr = rj}, 

i) The transition kernel /Qx^cr, cr') satisfies the detailed balance (DB) con- 
dition with //(cr). 

ii) //(cr) is a stationary distribution of the chain. 

iii) If p(r], rj') > 0, p r (a\r]) > for all a, a' G E and E C E holds, then {Y n } 
is p -irreducible. 

iv) {Y n } is aperiodic. 
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4. Computational Complexity of Multilevel CGMC 

The effectiveness of the multilevel CGMC method is a result of the syn- 
thesis of the following two arguments. Firstly, the computational cost of a 
conventional MH method is reduced by a two-level CGMC method. How- 
ever this is not enough to prove that the method can indeed accelerate con- 
ventional methods, and a mathematical spectral analysis emerges as a key 
algorithmic need. Therefore, secondly, Theorem [2] provides the relation of 
the two methods equilibration times using spectral arguments. 

4-1. Computational complexity 

An abstract comparison of the computational complexity of a conven- 
tional MH and the two-level CGMC is summarized in Table HJ for sampling 
a Gibbs measure with Hamiltonian By computational complexity here 
we mean the cost of calculating energy differences involved at the acceptance 
probabilities. The following analysis is based on the approximate CG strat- 
egy with potential splitting as described in subsection 13.11 Let us consider 
the canonical ensemble where energy difference is 

AH N (a, a x ) = (2a(x) - 1) ]T [K(x - y) + J(x - y)a(y)} . 

For potential J{x — y) with interaction range L each particle interacts with 
a number of (2L + l) d neighbours and similarly for K(x — y) with range 

5. Therefore the number of operations necessary are (2L + l) d + (2S + l) d . 
Similarly the energy differences appearing in the two-level method are 

AHP( v , v k ) = J2 J(kJHi) + J(o,o)(v(k)-i), 

AH s (a,(j*) =(2a(x)-l) £ K(x - y)a{y) , 

at the first and second level respectively. The compressed interaction po- 
tential J(k,l) has the reduced range L/q and the number of operations for 
calculating AH^ (r],r] k ) is (2L + l) d /Q, while for AH s (a,a x ) is (2S + l) d /Q. 
The method, in addition to the reduction of operations due to range suppres- 
sion, exhibits a computational reduction as a result of the fact that rejected 
trials at the first level will not be tested at the second, and the calculation of 
differences AH s (a,o~ x ) is avoided. A summary of this discussion appears in 
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Table 1: Operations count for evaluating energy differences for n iterations. The total 
number of accepted coarse trials m is the number of the second level iterations tested. 

Method Operations 

Metropolis Hastings n x (2L + l) d + n x (25 + l) d 
Two-level CGMC n x (2L + l) d /Q + m x (2S + l) d 



Table [TJ When the correction terms AHi(o~,o~ x ) — (i],r] k ) are present 

the additional computational cost in the second level is small, dependent on 
the decay of rate of J(x — y) — J(k,l), x, y G £at,x G Ck,y G C\. Since 
this term is in principle fast decaying, in implementations we can neglect 
interactions with distance larger than a cut-off range L c < L with a small 
error, that will contribute further in the computational time reduction. 

4-2. Comparison of equilibration times 

A direct estimation of rate of convergence of a Markov chain generated 
by a Monte Carlo method is model dependent and in general intractable. 
Thus for the purpose of this work it is natural to study this property as a 
comparison of the proposed method with a conventional method. Theorem [2] 
provides such a comparison of the spectral gap between the conventional MH 
method and the two-level CGMC method, for sampling a measure n(da) on 
a countable state space S. This comparison is summarised in inequality ( |20l) 
proving that their relation, in terms the spectral gap, is controlled by the 
approximation ft^(dr]) of marginals fi(dr]) = /ioT 4 ( drj). 

For a discrete time Markov chain {X n } with transition kernel /C and 
stationary distribution fi, the mixing time r is defined as 

t := min n jva G £ : ||/C n (a, ■) - ^{-)\\tv < ^ j ■ 

For two probability measures fi, v the total variation norm is — v\tv = 
\ J2 a IM ") — u ( a )\- Bounds of the total variation norm appearing can be 
gjyen in terms of /C's spectral gap A, for example for a reversible kernel holds 

2\\K n {a r )-^\\ TV <-^—{l-\r. (19) 

The spectral gap of a kernel /C is defined by X(fC) = min j var '(/) ' ^ ar (/) ^ ^} 
with the Dirichlet form £(f, f) = \ ^ ctct , \f(a) - fa')\ 2 IC(a, a')n{a) and the 
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variance Var (/) = | J2aa' Ifi 17 ) ~ f( a ')\ 2 f 1 ( <T ')f J '( <T )- I* 1 view °f (JXSJ) , one can 
say that between two algorithms producing Markov chains with identical 
equilibrium distributions better in terms of speed of convergence is the one 
with the larger spectral gap. 

Let £(JCcg)i£(K-c) denote the Dirichlet forms and A(/Ccg), A(/C c ) the 
spectral gap corresponding to the two-level CGMC and the classical Metropo- 
lis transition kernels /Ccg(c", cr') ( TT8T) and /C c (a, a') ( fT2l) respectively. 

Theorem 2. Lei p(<r, a') be a symmetric proposal transition probability for 
the conventional MH algorithm and p(j], rj') a symmetric proposal transition 
probability on the coarse space S for the two-level CGMC algorithm, then for 
any conditional probability fi r (a\r]) 

Aj_\(}C c ) < \{K CG ) < tA(/C c ) (20) 

where A = xal a>a '{A(cr, a')} and 7 > 0,7 > such that 7 < B(a,a') < 7, 
with A(cr, a') and Bio, a') defined in lemma 

Remarks. 

1. Existence of finite and positive values of A = m.{ aa i{A(o, cr')} is en- 
sured for the models studied in this work. Consider the Gibbs mea- 
sure jj, N ^{da) oc exp{— (3H N (a)} and p,^} g(drj) oc exp{— (3H^(rj)} as 
defined in Section [2] Let J(r), s.t.J2 r J( r ) = J* < 00 with the com- 
pressed interactiosn J as defined in ([6]). All possible values of A(o~, a'), 
AH, are A(a,a') = 1, 



A(a,a') = min{e -^H m M)^ e -HAH(°Hv',v) } > e -pJ* . 

and 

A(a, cr') = m i n { e -mHN(^')-AH(°l( v , v ')) ^ e -p(AH N (a',<T)-ABm( V >,r,)n 

= 1 + °(pITTf)' 

This is a result of the known estimate Lemma (2.3) j2o|, 

Q 



AH N (a,a')-AH^(r]) = O ( 



(2L + 1) 



This proves also that values of -4(cr, a 1 ) are close to 1 controlled by the 
approximation parameter Q. 
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2. Term <B(cr, er') depends only on the difference between the proposal 
kernels in the two methods. For example if /j, r (a\rj) is chosen such 
that p(cr,a f ) = p(r],r]')fi r (a'\ri') for all a, a' G £ then B(a,a') = 1, and 
inequality ( 120]) depends only on ^4, that is 

A\()C C ) < \{K CG ) < A(/C c ) . 

Numerically the statement of the Theorem is revealed by a comparison of 
the average acceptance probabilities in examples, see Figures HI We expect 
that best approach in terms of equilibration rate is a rejection free method 
for sampling the CG distribution, which is an approach that we elaborate 
with in a forthcoming work. 

To prove Theorem [2] we need the following lemmata. 

Lemma 3. For symmetric proposal transition kernels p(a, a') = p(o~', er) and 
p(rj, 7]') = p(r/, rf), for all er, er' G £ with er ^ a' , 

/C CG (er, a') = A(a, o')B{a, a'))C c {a, er') 

where 

B , n _ Pjjhjf) jvrjcr'W), ifa f (a,a') = l 
[a ' a) p(o-,a')\p r (a\ V ), ifa f (a,a')<l ' 1 J 

Furthermore we define the subsets 

C\ ={(a,a')eY,xT l : {a < l,a C G < l,a/ < 1} or {a = l,a C c = l,a/ = 1}(F 22 ) 

C*2 ={(o~, cr')eEx£: {a = l,acG < l,e*/ = 1} or {a < l,acG = l,a/ < 1}} 

C3 ={(cr, ff')eSxS: {a = 1, QicG = l,a/ < 1} or {a < 1, QfcG < l,a/ = 1}} 

C4 ={(cr, ct')gSxS: {a < 1,«cg = 1 5 Q / = 1} or { a = ^-i a GG <!,«/< 1}} 



and 



■4(<T,<7') 



1, if(<ry)6Ci 

mm i A (0) (7? ) . A (o)(^)/' */ ^ ' ° 2 

min r M^)A (0) W K^trfl i ?Y fa a*) F r. 



(23) 



PROOF. Recall that /C c (er, er') = o;(er ) a')p(a, a'), for er 7^ er', which for the 
conventional MH method and the hypothesis of symmetry of p becomes 
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l,^}p((j,(j / ). Similarly the two-level CGMC kernel 
p(a')p {0) (vhr(a\ V ) 



/C c (cr, a') = min 
becomes 

x min 1 1, ^( )^j } Pfo t/KKIV) • 

The proof is based on a case study on the values of the acceptance proba- 
bihties a(a,a>) = min{l,^>}, a f (a,a>) = ^{l. ig^i } and 

acciViV') = min |l, ^(o/^ }• m the following we also use extensively the 
symmetry property of p and p. All possible combinations on their values are 
categorized in the sets Ci,i = 1, . . . , 4 defined in ( 1221) . For (a, a') G Ci and 
a(cr,a') < 1, a/(a,cr') < 1 and acciViV') < h JC c (a,a') = l ^p(a,a') and 

and we derive their relation /Ccg(c, cr') = ^^fy^/C c (cr, a'). With similar 
simple calculations, that we omit here, for all sub-cases that are encountered 
in subsets Ci, % — 1, . . . 4 we have, for (cr, a') G Ci 

£cg(o", a') = B(a, a')IC c (a, a') . 

with 

B{ a a') = ^hjfl \vr{p'\i)i if «/(^, O = 1 
p(a,a') \p r (a\ri), if a/(cr,cr') < 1 ' 

For (cr, a') G C 2 such that a(a, a') = 1, acciViV') < 1j ^/l ") °"') = 1; 

} = ^5 p^,^) — K ^ a } ' 

and for cv(a, cr') < 1, acG(ViV') = 1> a /(° r ? cr/ ) < 1? 

ILCG\,<7, O ) = _, , 7 A C (CT, (T ) , 
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that we can summarize to 

Kcg{<t, o ) = mm | B(a, a )K c (a, a ) , 

since for example in the first case jjj$yjQ < 1 < ^oj&k and inversely for the 
second. Following the same reasoning for (cr, a') G C3, 

K CG (a, a ) = mm j , ^y^) ) ( ' ) • 

and for (cr, a') G C4 

Kcaiv, a') = min ( 44 1 "O^cfa a') . 

All these steps prove, the following relation of transition kernels generated 
by Algorithms [I] and [21 

^CgO, °"') = O^fa 0-')^c(o-, cr') , 

with ^(tr, a') and fi(tr, a') defined in (|23) and (|2ip. □ 
The proof of Theorem [2] is based on the application of Lemma [3] and 
Lemma 3.3 in the work of P. Diaconis and L. Saloff-Coste js|] that is stated 
here for completeness. 

Lemma 4. Let /C,/i and K.',fi' be two Markov chains on the same finite set 
X. Assume that there exist A,a> such that 

£' < A£, a/i < // , 

then 

A'<-A. 

a 

We conclude with the proof of Theorem [2j 

Proof. We compare the Dirichlet forms £(1Ccg),£(^c) using the definition 
of Dirichlet form and applying Lemma fl3]). By the definition of A(a, a') for all 
a, a' G £ holds < A(a, a') < 1 and assume that A := inf CTj0 -/{^4(cr, a')} > 0. 
Then by Lemma |3] 

M{A(a,a')}B(a,a')]C c (a,a') < )C CG (a,a') < B(a,a')JC c (a,a') . 
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Let 7 > and 7 > such that 7 < B(a, a') < 7. Then 

mi{A(a,a')}'jK: c (a,a') < K. CG ((j,(j') < jfC c (a,a') . 

Recalling the definition of Dirichlet form for kernel JCcg, 

<t,<t' 

and the above relation we can write iaf{A(a, a')}^£ c < Scg < l&c- Applica- 
tion of LemmaHJ for which here // = fi, thus a = 1 and A = inf CTiCT /{.4(cr, c')}^f 
for the left hand side inequality, and A = 7 for the right hand side, gives the 
relation of spectral gaps 

mf{A(a,cr')h\(1C c ) < \{K CG ) < jX(1C c ) . 

□ 

5. Benchmark examples for the canonical ensemble 

5.1. Combined Ising and Curie Weiss model 

We consider a benchmark example of competing short- and long- range 
interactions, where constants A, and 7,7 appearing in Theorem |2] are calcu- 
lated explicitly. Furthermore analytical expressions for the free energy in the 
thermodynamic limit, iV — > 00, are known [16(] . The energy of the system at 
configuration a G Sat is 

H N (a) = -— E E a ( x ) a (y) - ^ E ^2 a ( x ) a (y) - h E a ^ 

x£A N \x—y\=l xeAjv y^x xeAjv 

= H s (a) + Hi(a) - h < x ) • 

The interactions involved in H s (a) are of the nearest-neighbour type with 
strength K, while H\(a) represents a mean-field approximation of a potential 
J(x — y) as in (|S]) for example, averaged over all lattice sites. The coarse 
grained Hamiltonian Hi(r]) is exact, equal to the microscopic energy Hi(a). 
Indeed, for any coarse graining level Q 

Hfa) = = ~ J2 E MM) -^nT, Vik) (v(k) - 1) , 

fceA M l£A M l^k k&A M 
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where r](k) = YlxeC a ( x )-> ^ ^ ^-M- No coarse graining approximation error 
is involved from compressing long range interactions, that allows us to study 
the effect of the splitting of short- and long- range interactions. We consider 
the canonical ensemble with the spin-flip dynamics. 

The detailed calculations on the application of Theorem [2] are given in 



Appendix B , where we prove that the constants appearing in inequality ( 12 Op 



are 

7 = 7 = 1, and 

_ f min{e-l J| , e" 2 ^} for K ^ 0, 
~ \ 1 for K = 0. 

The relation of the spectral gap of the methods is, for K ^ 0, 

min{e-l J ',e" 2|i "}A(/C c ) < \{K CG ) < A(/C c ) . 

For K = Theorem H] verifies the equivalence of the two methods \{1Ccg) — 
X(JC C ) as is expected. The dependence on the splitting of energy is revealed, 
showing that the governing parameter is the relative strength of the short 
and long range interactions J and K. 

Here we present simulation results for two dimensional lattice systems, 
while in [l5j one can find a detailed numerical study for one-dimensional 
systems. Table |2] shows a comparison of the computational time between 
the MH and the two-level CGMC method for a lattice N = 16 x 16 and 
coarsening parameter Q = q x q, for q = 4 and q = 8. The results depict 
a reduce of computational effort at a rate close to 0(Q). This rate is not 
exactly 0(Q) because of the additional computational time necessary for 
implementing the local uniform reconstruction. The hysteresis diagram is 
indicated in Figure El for the average total coverage <c> first upon increasing 
the field h from low values and then decreasing it from high values. The total 
coverage is c(er) = iV" 1 J2 X a ( x ) anc ^ avera g e quantities <c> are computed 
after equilibration in the Monte Carlo sampling, using 1000 samples. The 
tests demonstrate that the two-level CGMC method predicts correctly the 
phase transition regime, compared to the conventional MH, with a reduced 
computational cost. 

5.2. Kac type interactions 

Known error estimates indicate that the potential decay is one of the 
parameters controlling the approximation error in coarse graining techniques. 
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Table 2: CPU cost comparisons for different resolutions in hysteresis simulations. K = 
1, J = 5, h G [0, 6], N = 16x16. 



Method 


CPU (min) 


MH q=l 


327 


Two-level CGMC 


q=4 28 


Two-level CGMC 


q=8 9 




Figure 2: Comparison of hysteresis of the microscopic with the two-level process for 
modelO N = 64 x 64, q = 4, K = 1, J = 5. 

The effect of potential singularities on the coarsening error and improving 
strategies encountering multi-body interactions and/or potential splitting has 
been studied in (ij. The example tested in this section is chosen to study 
the potential splitting strategy with the proposed method, that is shown to 
improve direct CGMC results. We consider a long range Kac type interaction 
potential J(r) = N^V^r /N), r = \x — y\ > 0,x,y G with an algebraic 
decay 

V(r) = -^, ifr>0, 

where constant v is chosen to ensure the conservation of the total mass Jo — 
/ J{r)dr. The potential splitting strategy is applied by decomposing the 
interacting potential into a short-range J s (r), as well as a long-range J/(r), 
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defined by 




J(r) for < r < S, 
for S < r, 



and 



J(r)-J s (r). 



This splitting defines the energy of teh system in the form H^(a) = H s (cx) + 
Hi(a) as in ffTTT) with — y) — J s (\x — y\) and J(a; — y) = Ji(\x — y\). In 
Figure [3] and Table [3] we present simulation results in the canonical ensem- 
ble derived from the three methods, the conventional MH with interaction 
potential J(\x — y\),x,y G A^v and, the two-level CGMC with interactions 
Jiik, l),k,l G A M and J s (\x — y\), sampling on the microscopic space E, and 
the CGMC with interaction potential J sampling on the coarse space E^- It 
is observed that, for a fixed coarsening parameter Q, the two-level method 
reduces the coarse graining error of CGMC simulations, since the most sin- 
gular part of J{r), J s (r), is treated in the microscopic space and coarsening 
is applied only to its fast decaying part Ji(r), Figure [3) Figure H] demon- 
strates the increase of acceptance rate of Metropolis sampling when viewing 
the proposed method as an improving the propoposed samples. Overall, as 
is stated in Theorem [2j the method's acceptance rate is comparable to the 
conventional method. Additionally, a reduction of the computational time 
compared to the conventional sampling method is achieved. The error ap- 



Figure 3: Comparison of hysteresis of the microscopic with the compressed (CGMC) and 
the two-level (ML CGMC) process for model E21 PJo = 6, N = 512, q = 8, S = 1. 
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(a) (b) 

Figure 4: (a) The second step ML CGMC and the conventional MH average accep- 
tance probabilities in hysteresis simulations, Figure [3] showing increase of acceptances in 
a Metropolis - Hastings method, (b) Comparison of the total ML CGMC and the conven- 
tional MH average acceptance showing equivalent equilibration rates. 

pearing in Table [3] is the I 2 distance of the average total coverage <c>, of 
the indicated method, from the conventional MH result, as a function of the 
external field h. 

Table 3: CPU cost and error comparisons for different resolutions and methods. N = 
512 , /3 J = 6, S = 1. 



Method 




CPU time 


Error 


MH q=l 




252 





CGMC q=8 




47 


0.78 


Two-level CGMC 


q=8 


57 


0.65 


CGMC q=64 




5 


2.10 


Two-level CGMC 


q=64 


6 


0.81 



6. Nanopattern formation in heteroepitaxy — Microcanonical en- 
semble 

We study a nanopattern formation problem and show how the proposed 
method can provide microscopic information, benefiting from the low com- 
putational cost of the coarse graining technique while refining the error in- 
troduced. The system studied here is characterized by the interplay of short 
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ranged attraction and long ranged repulsion interactions between particles. 
Such an interplay can lead to the formation of patterns, such as discs and 
stripes appearing for example in heteroepitaxy. 

The energy of the system is given by /fjv(c), as in (j2J), with isotropic 
interaction potential 

J(r) = Jo {e^ r/ra)2 - xe~ (r/rv)2 J , r = \x - y\ > 0, x, y G A N . 

Jo is the strength of the potential, r a and ry are the dimensionless length 
scales of attraction and repulsion, respectively, and x is the repulsion strength 
parameter. A study of the kinetic phase diagrams and application of CGMC 
methods for this system is given in (5)]. The underlying dynamics describ- 
ing the surface diffusion of particles considered are spin-exchange, obeying 
the exclusion principle and conserving the order parameter microcanonically. 
The order parameter here is the total coverage Co = N" 1 X^eA a ( x ) repre- 
senting the number of occupied sites on the lattice. 

In the tables and figures following we provide simulation results from three 
methods, the conventional MH, the CGMC and the two-level CGMC. For 
the latest method two approaches were tested, the sampling with correction 
terms, Figures [6] -[7] and Tables 0]- [5] and the approximate CG with potential 
splitting, Figure[6j Figure |5] shows a graph of the interaction potential J(\x — 
y\) and J c (\x — y\) = J(\x — y\) — J(\x — y\) appearing in the second step 
of the two-level method, representing the correction of compressing at the 
first step of the method. J(\x — y\) = J(k, I), x, y e with x G Ck, y G C\ 
is the compressed potential, used for the coarse space simulations. In all 
simulations presented in the sequel the range of pure attractive and repulsive 
forces are r a = 4.47, r r = 10 respectively with repulsion strength x = 0.1. 
Since the interactions follow an exponential decay, in implementations we 
use a cut-off range L for the potential J(r) such that J(r) < 10 -6 for r > L, 
which accounts on (2L) 2 interactions for each site. 

For total coverage Co = 0.9, that leads to pattern composed by periodic 
inverted discs, Table H] gives the average discs diameter and computational 
time for different values of lattice size N, obtained with the two-level CGMC 
method with the corrections strategy. The results verify that the proposed 
method provides the expected behaviour of the system for a finite lattice 
as its size increases, that is features properties stabilize. Features statistics 
are calculated with the use of edge detection techniques of image processing 



32|. In order to demonstrate the effect of the statistical errors we present 



26 



Figure 5: Correcting compressed interactions. 

confidence interval estimates. Such intervals are obtained after transform- 
ing the simulations data to follow an approximately normal distribution. In 
simulations for all methods the number of MC iterations is 5 x 10 7 . For 



Table 4: Finite system size behaviour of the two-level CGMC method. Average discs 
diameter <d> and computational time, cq = 0.9, (3 Jo = 0.6 q — 8, L = 24. 



Lattice size 


Diameter 


Std 


Confidence Interval 


CPU time(min) 


128x128 


10.2 


3.6 


[5.7, 13.3] 


30 


256x256 


9.5 


2.5 


[8.7, 9.9] 


43 


512x512 


8.8 


3.4 


[8.5, 9.1] 


66 


1024x1024 


8.6 


3.6 


[8.5, 8.7] 


110 


MH method 


512x512 


8.27 


1.2 


[8.1, 8.3] 


805 



the sake of comparison in Table H] we present also results obtained with the 
MH method for a iV = 512x512 lattice. For the same lattice the CGMC 
method was tested with coarse graining parameter q = 8 that shows signifi- 
cant deviations, the average diameter being <d> ~ qx<d> = 8x7.8. This 
proves that we over- coarsened, on the other hand for smaller values of the 
parameter Q CGMC method provides better approximations, for example 
when q = 2 <d> = 4.9 and <d> ~ 2x4.9, but the computational time, being 
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187 minutes, is still relatively large. 

Figures [6] and [7J show the averaged equilibrium conformations for two 
total coverage values Co = 0.9 and Co = 0.5, leading to the formation of 
inverted discs and stripes respectively, for a iV = 256x256 lattice. Black and 
white dots indicate occupied and vacant sites respectively. CGMC sampling 
is fast (i.e. 7.3 minutes for q — 8) but with large deviations from the expected 
pattern are significant, while the two-level CGMC refines that error providing 
correct feature scales, of course with the cost of increasing the computational, 
see Figures [7b] and [Tel On the other hand there is a substantial reduce on 
computational time compared to the MH method, from 365 to 25 minutes. 
For a smaller coarse graining parameter, q = 4 in Figure [7d[ the CGMC 
prediction is better albeit at a higher computational cost where again and 
the coarsening error is improved by the proposed method and microscopic 
details are available. 




(a) MH (b) CGMC, q=8 (c) Two-level, q=8 
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(d) CGMC, q=4 (e) Two-level, q=4 

Figure 6: The two-level method with correction is independent of the coarsening parame- 
ter. Inverted discs c = 0.9, /3J = 0.6, N = 256x256. 



Table [5] shows a comparison of the discs statistics and computational 
time for the conventional and the two-level method for various coarse grain- 
ing parameters. The two-level method estimates are close to the MH even 
for high values of the coarse parameter, with a significant reduction of the 
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(a) MH 



(b) CGMC, q=8 



(c) Two-level, q=8 





(d) CGMC, q=4 



(e) Two-level, q=4 



Figure 7: The two-level method captures the correct feature scaling even when CGMC is 
inaccurate. Labyrinths cq = 0.5, /3Jq — 0.2, N = 256x256. 



computational time. 



Table 5: Two-level CGMC with correction, N = 256 x 256 



Method Diameter <d> Std CPU time (min) 



MH 


8.7 


1.3 


252 


Two-level q=2 


8.4 


1.6 


66 


Two-level q=4 


7.9 


4.6 


22 


Two-level q=8 


8.9 


3.2 


23 



Figures [8] present results of simulations with the approximate splitting 
approach where we split the interactions up to range S and neglect the cor- 
rection terms at the second step, see Section 13.11 This strategy captures the 
qualitative picture but misses the characteristic length. From the simulations 
we tested we conclude that this approach is not suggested since the splitting 
disturbs significuntly the relative strength of attractive and repulsive forces 
that has an impact on characteristic length scales of the features. 
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(a) MH (b) Splitting 

Figure 8: Labyrinths. Coupled method with splitting potential, q — 8, s = 4. Splitting 
captures the qualitative picture but misses the characteristic scale. 

7. Conclusions 

In this work we propose the multilevel CGMC method and study its ef- 
ficiency theoretically and numerically. The hierarchy of CGMC methods in- 
troducing multilevel decompositions are shown to reduce the computational 
complexity of conventional methods. The reconstructing measures, imple- 
mented by an accept/reject method, provide the correct inverse mapping 
from coarse to fine resolutions refining CGMC sampling. With Theorem [2] 
we prove that in terms of speed of convergence the two-level CGMC is com- 
parable to a conventional MH, that is controlled by the coarse approximating 
measure fl^^dr]). In general the method's mixing time cannot be better than 
a conventional method, but the reduction of computational effort with the er- 
ror control of the coarse-graining approximation make it an adequate method 
for studying high dimensional systems. We successfully apply the method 
in a nano-pattern discovery problem, verifying that the multilevel CGMC 
method provides the expected behaviour of the system and microscopic in- 
formation, even when CGMC is inaccurate. 
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Appendix A. Proofs 

Before studying the proposed method's mathematical properties we need 
to introduce some definitions and theoretical facts [33j. Let {X n } be a 
Markov chain on space E with transition kernel /C. 

Definition 1. i) A transition kernel K, has the stationary measure fi if 

/ /C(cr, a')ii{da') = /x(cr), for all a G £ . 
is 
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ii) /C is called reversible with respect to \x if 

{g, /C/% = {Kg, /)„, for all g, f E L 2 {p) . 



where (g, /) M = J s g(a)f(a)p(da), g denoting the complex conjugate of g 
and K,g(a) = J s ]C(a,da')g(a'),Wa G £. A sufficient condition for /i being a 
stationary measure of /C that is often easy to check is the detailed balance 
(DB) condition. 

Definition 2. (Detailed Balance) A Markov chain with transition kernel 
K. satisfies the detailed balance condition if there exists a function f satisfying 

K(a,a')f(a) = )C(a' ,a)f(a'), for all a,a' G S . (A.l) 

Here we continue with the proof of Theorem [TJ 

Proof, i) Let a 7^ a', recalling the definition of transition kernel K.cg{o~, o~') 
( ITSj) we have 

Kcg(ct, o-')n(a) = a f (a,a')ac G (Ta,Ta')p r (a'\Ta')p(Ta,Ta')p(a) 

f p{a')^{Ta)pM^) \ 
I A*(o")/2(°)(Tcr')// r .((7 / |T(7 / ) J 

f u,W(Ta')p(Ta'Ta)} , , tm „ /m m A , N 

. M (a)/<( )(T</) ft .(a'|Ta') 
mm < 1, 



' //(crO/i( )(Tff)/Xr(ff|T<7) 

ii) follows from i). Detailed balance with p(a) is sufficient to guarantee 
that u(a) is the stationary distribution of kernel JCcg^, &'), (Theorem 6.46 
in Q). 

raj To prove that chain {Y n } is \x- irreducible we need to prove Kcg(&, A) > 0, 
for all a G E and A measurable such that p(A) > 0. We have 

Kcg(ct,A)= / K, C g{ct, a') da' > / /C C g(o", o"')dcr' = 

= / af(a,a')acG(Ta,Ta')p r (a'\Ta')p(Ta,Ta')da' . 

JA-\a\ 
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From assumptions on p(r},rf) and p r (a\rj) term p r (a'\T(r')p(T<j, To 7 ) is pos- 
itive for all a, a' G E. Also since A C E and E C supp(po), otf{a,a') and 
acG(To", Ta') are positive. These ensure that JCcg(&, A) > 0. 
iv) A sufficient condition ensuring that {Y n } is aperiodic is that JC(a, {a}) > 
for some a G E, that means the event Y n+ i = Y n happens with positive 
probability. We have 

Kcg{°, {<?}) = 1 - / a f (a, a')a CG (Ta, Ta')p r (a'\Ta')p(Ta, Ta')da' . 

If for all a G S, /CcGr(cr, {°"}) = then 

/ a/O, a')a CG {Ta, Ta')pL r {a'\Ta')p{Ta, Ta')da' = 1 , 

which means that af(a,a') = 1 and acG^Ter, Ter') = 1 for almost all a G 
{a G £ : p(Ta, Ta')/! r (V |TV) > 0, for some a' G E}. This would mean 
that the reconstructed proposal kernel p(Ta, Ta')p r (a'\Tcr') is sampling from 
the exact target measure p which in general is not true. Therefore there exists 
a G E such that JCcgW, {a}) = 0. □ 



Appendix B. Benchmark example calculations 

The detailed calculations of the application of Theorem [2] for the bench- 
mark example described in Section 15.11 are provided in detail here. 

The conventional MH algorithm is described by a transition kernel propos- 
ing a spin-flip at the site x with the probability l/N, that is p(a,a') = 
N^ 1 XLeA < H cr ' — w ith acceptance probability 



Ma, a = mm 



1 e (l-2a( X ))[K(<r( x -l)+a(x+l))+± <x(y)] 



where, for simplicity, we consider the case of zero external field h. In the 
two-level CGMC algorithm the proposal probability distribution is 



M 

k£A M 



where r] k± (l) = rj(l),l ^ k and r) k± (k) = r](k) ± 1 and k± denotes the 
adsorption (+) or desorption (— ) in the cell k. The acceptance probability 
of the first level is 



a CG (v,V h± ) = min{l,e- Afc± ^} 
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where Ak+Hfa) = -^E; e A M ^(0 and A k -Hi(rj) = -+ [l - J2ieK ¥ vQ)] ■ 
The reconstruction probability distribution is a uniform distribution in each 
cell jj, r {a x \r] k± ) = YlieK ( a Ci\ r i k± Q'))- Note that at each MC iteration the 
change in the new state rf happens in cell k, thus we need to perform the 
reconstruction of a x only in the cell k, according to 

= - *+) + - *-) ■ 

According to Algorithm [21 the second level acceptance probability is 

a f (a,a x ) = min {l, e~ A * H ^} = min {l, e ^(i-M*))M-i)+<^+i)) } . 
Term B(a, a') (j2l|) is equal to one for all a, a' G Ejy. Indeed for rf = r] k+ and 

x e C k c A N , 

PKViV )Pr{cr \V ^) _ m q Q- V (k) = 1 
Similarly for rj' = t] k ~, 

p(V,V k ~)Pr(v X \V k ~) = Tl ri(k) = 

Therefore its upper and lower bounds are 7 = 7 = 1. We consider the 
splitting approach, and use that H^{a) — Hi(rj) = H s (a) and Hi(a) = Hi(rj). 
The for any x G such that x G Ck, k G Am we have 

if (a, a x ) G d 
if {a, a x ) G C 2 
if (a,a x )eC 3 
if (a, a x ) G C 4 

Set C 4 = according to the following argument. Let (cr, o^) G C 4 such 
that = 1, otf = 1 and a < l.The first two relations are equivalent to 
A k Hi(rj) < and A x H s (a) < that imply A x H N (a) < 0, since H N (a) = 
H s (a) + Hi(r]), thus a = 1, a contradiction. Analogous argument holds for 
the case a — 1, ace < 1, a/ < 1 that proves C 4 = 0. 



* 

min{e" /3AfcJ?;(r ' ) , e^*^^}, 
minle-^^^ie^^M}', 

m j n | e -/3A a; H J v(a)^ e (3A x H N (cr) 1 
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Let us consider K ^ 0. Using the analytic expression of A x H s (cr) and 
A k Hi(rj) we have, for (a, a x ) G C 2 

Similarly for (a, a x ) G C 3 

min | e -/3A,_^ W)e /3A fc+J ff i(j? )| > e -|^(JV-i)| > e -|j| and 
minje-^^^^e^^^} > e" 21 ^ 1 . 

Therefore inf ^ A{a, <J X ) = A(a,a x )\ {rT .^ x)=iyxeAN} = min{e _|J| , e" 21 ^ 1 } . 

When K = then C 2 = 0, that can be proved with simple arguments, and 
for (a, a x ) G C 3 A(a, a x ) = 1 since A x H s (a) = 0. Proving that A(a, a x ) = 1, 
for all x G Atv and a <ET, N . 
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